Robust mixture modeling using t-distribution: application to speaker ID

Authors

Sundar Harshavardhan

Thippur V. Sreenivas

Abstract

Robust stochastic modeling of speech is important for the performance of practical applications. The Gaussian mixture model (GMM) is widely used in speaker ID, but its performance is limited in the presence of unseen noise and distortions. Such noisy data, referred to as "outliers" with respect to the original distribution, can be better represented by heavy-tailed distributions such as Student's t-distribution. It is a natural choice because the heaviness of the tail can be controlled through the degrees-of-freedom parameter, ν. We explore a finite mixture of t-distributions model (tMM) to represent noisy speech data and show its robustness for speaker ID compared to the GMM. On the TIMIT and NTIMIT databases, the recognition accuracies obtained with a 34-mixture tMM are 100% and 79.68%, respectively, much better than those reported in the literature.
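The role of ν can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar (univariate) features rather than the multivariate speech features a speaker-ID system would use, and the function names `gaussian_logpdf`, `student_t_logpdf`, and `mixture_logpdf` are hypothetical. It only shows that a t-distributed component assigns a far-out observation a much higher log-density than a Gaussian with the same location and scale, and how component log-densities combine into a mixture log-density.

```python
import math


def gaussian_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of a univariate Gaussian with mean mu and std sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def student_t_logpdf(x, nu, mu=0.0, sigma=1.0):
    """Log-density of a univariate Student's t with nu degrees of freedom,
    location mu and scale sigma (illustrative parametrization)."""
    z = (x - mu) / sigma
    return (
        math.lgamma((nu + 1.0) / 2.0)
        - math.lgamma(nu / 2.0)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - 0.5 * (nu + 1.0) * math.log1p(z * z / nu)
    )


def mixture_logpdf(x, weights, mus, sigmas, nus):
    """Log-density of a finite mixture of t components.

    weights are assumed to be positive and to sum to one; the sum over
    components is evaluated with the log-sum-exp trick for stability.
    """
    terms = [
        math.log(w) + student_t_logpdf(x, nu, mu, sigma)
        for w, mu, sigma, nu in zip(weights, mus, sigmas, nus)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


if __name__ == "__main__":
    outlier = 6.0  # six scale units away from the component centre
    print("Gaussian log-density :", gaussian_logpdf(outlier))
    for nu in (3.0, 10.0, 100.0):
        print(f"t (nu={nu:g}) log-density:", student_t_logpdf(outlier, nu))
```

A small ν gives heavy tails, so outliers are penalized less; as ν grows, the t log-density approaches the Gaussian one, which is why ν acts as a control on robustness.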


Similar resources

Text-independent speaker identification using Gaussian mixture bigram models

In this paper, a novel speaker modeling technique based on Gaussian mixture bigram model (GMBM) is introduced and evaluated for text-independent speaker identification (speaker-ID). GMBM is a stochastic framework that explores the context or time dependency of continuous observations from an information source. In view of the fact that speech features are correlated between successive frames, w...


Evaluation and Application of the Gaussian-Log Gaussian Spatial Model for Robust Bayesian Prediction of Tehran Air Pollution Data

Air pollution is one of the major problems of the Tehran metropolis. Because Tehran is surrounded on three sides by the Alborz Mountains, pollutants from car traffic and other polluting sources are trapped in the city and cannot escape without adequate wind. Carbon monoxide (CO) is one of the most important sources of pollution in Tehran's air. The...


Speaker Identification From Youtube Obtained Data

An efficient and intuitive algorithm is presented for identifying speakers in a long dataset (such as a long YouTube discussion or a recorded cocktail-party audio or video). The goal of automatic speaker identification is to identify the number of different speakers and prepare a model for each speaker by extracting and characterizing the speaker-specific information contained in the speech s...


Robust text-independent speaker identification using Gaussian mixture speaker models

This paper introduces and motivates the use of Gaussian mixture models (GMM) for robust text-independent speaker identification. The individual Gaussian components of a GMM are shown to represent some general speaker-dependent spectral shapes that are effective for modeling speaker identity. The focus of this work is on applications which require high identification rates using short utterance ...


KL realignment for speaker diarization with multiple feature streams

This paper aims at investigating the use of Kullback-Leibler (KL) divergence based realignment with application to speaker diarization. The use of KL divergence based realignment operates directly on the speaker posterior distribution estimates and is compared with traditional realignment performed using HMM/GMM system. We hypothesize that using posterior estimates to re-align speaker boundarie...




Publication date: 2010
